
    Efficient Belief Propagation for Perception and Manipulation in Clutter

    Autonomous service robots are required to perform tasks in common human indoor environments. To achieve the goals associated with these tasks, the robot should continually perceive and reason about its environment, and plan to manipulate objects, which we term goal-directed manipulation. Perception remains the most challenging of these stages, as common indoor environments typically pose problems in recognizing objects under the occlusions and physical interactions inherent to clutter. Despite recent progress in robot perception, accommodating perceptual uncertainty due to partial observations remains challenging and must be addressed to achieve the desired autonomy. In this dissertation, we address the problem of perception under uncertainty for robot manipulation in cluttered environments using generative inference methods. Specifically, we aim to enable robots to perceive partially observable environments by maintaining an approximate probability distribution, a belief, over possible scene hypotheses. This belief representation captures uncertainty resulting from the inter-object occlusions and physical interactions inherently present in cluttered indoor environments. The research presented in this thesis develops state representations and inference techniques to generate and maintain such a belief over contextually plausible scene states. We focus on providing the following capabilities to generative inference while addressing the challenges posed by occlusion: 1) generating and maintaining plausible scene hypotheses, 2) reducing the inference search space, which typically grows exponentially with the number of objects in a scene, and 3) preserving scene hypotheses over continual observations. To generate and maintain plausible scene hypotheses, we propose physics-informed scene estimation methods that combine a Newtonian physics engine with a particle-based generative inference framework. The proposed variants of our method, with and without a Monte Carlo step, show promising results in generating and maintaining plausible hypotheses under complete occlusion. We show that estimating such scenes would not be possible with commonly adopted 3D registration methods, which lack the notion of physical context that our method provides. To scale context-informed inference to a larger number of objects, we describe a factorization of the scene state into objects and object parts that enables collaborative particle-based inference. This resulted in the Pull Message Passing for Nonparametric Belief Propagation (PMPNBP) algorithm, which caters to the high-dimensional, multimodal nature of cluttered scenes while remaining computationally tractable. We demonstrate that PMPNBP is orders of magnitude faster than the state-of-the-art Nonparametric Belief Propagation method. Additionally, we show that PMPNBP successfully estimates the poses of articulated objects under various simulated occlusion scenarios. To extend PMPNBP to tracking object states over continuous observations, we explore ways to propose and preserve hypotheses effectively over time. This resulted in an augmentation-selection method, in which hypotheses are drawn from various proposals and a subset that explains the current state of the objects is then selected using PMPNBP. We discuss and analyze our augmentation-selection method against its counterparts in the belief propagation literature.
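    To give a concrete flavor of the "pull" strategy that distinguishes PMPNBP from push-style nonparametric belief propagation, the following minimal Python sketch reweights a fixed set of candidate particles at a receiving node by pulling compatibility scores from each incoming weighted particle set. This is a simplified illustration under stated assumptions, not the dissertation's implementation; the names pull_message and pairwise_psi are hypothetical.

        import numpy as np

        def pull_message(samples_t, incoming_msgs, pairwise_psi):
            """One 'pull' message update (simplified PMPNBP-style sketch).

            samples_t     : (n, d) array of candidate particles at the
                            receiving node
            incoming_msgs : list of (samples_s, weights_s) pairs, one per
                            neighboring node, each samples_s of shape (m, d)
            pairwise_psi  : callable psi(x_s, x_t) -> compatibility score
            """
            weights = np.ones(len(samples_t))
            for samples_s, weights_s in incoming_msgs:
                # Each receiving particle "pulls" its weight as the expected
                # compatibility with the neighbor's weighted particles,
                # instead of sampling from a product of incoming messages.
                pulled = np.array([
                    np.dot(weights_s, [pairwise_psi(x_s, x_t) for x_s in samples_s])
                    for x_t in samples_t
                ])
                weights *= pulled + 1e-12  # guard against degenerate zeros
            return weights / weights.sum()

    Because each outgoing particle is scored independently against the incoming particle sets, the cost grows with the product of particle counts per edge rather than exponentially in the number of incoming messages, which is consistent with the speedups reported here.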
Furthermore, we develop an inference pipeline for pose estimation and tracking of articulated objects in clutter. In this pipeline, the message-passing module with the augmentation-selection method is informed by segmentation heatmaps from a trained neural network. In our experiments, we show that the proposed pipeline can effectively maintain a belief over, and track, articulated objects through a sequence of observations under occlusion.
PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/163159/1/kdesingh_1.pd

    TSBP: Tangent Space Belief Propagation for Manifold Learning

    We present Tangent Space Belief Propagation (TSBP), a method for graph denoising that improves the robustness of manifold learning algorithms. Dimension reduction by manifold learning relies heavily on the accurate selection of nearest neighbors, which remains an open problem for sparse and noisy datasets. TSBP uses global nonparametric belief propagation to accurately estimate the tangent space of the underlying manifold at each data point. Edges of the neighborhood graph that deviate from these tangent spaces are then removed. The resulting denoised graph can be embedded into a lower-dimensional space using existing manifold learning algorithms. Artificially generated manifold data, simulated sensor data from a mobile robot, and high-dimensional tactile sensory data are used to demonstrate the efficacy of TSBP.
    http://deepblue.lib.umich.edu/bitstream/2027.42/167252/1/09166624-Thomas_Cohn.pdf
    http://deepblue.lib.umich.edu/bitstream/2027.42/167252/2/iros_presentation-Thomas_Cohn.pdf
    http://deepblue.lib.umich.edu/bitstream/2027.42/167252/3/complete_tsbp_codebase-TCohn.zi
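    As a rough illustration of the graph-denoising idea, the sketch below estimates a tangent basis at each point and prunes neighborhood-graph edges that deviate from it. Note the simplification: TSBP estimates tangent spaces with global nonparametric belief propagation, whereas this stand-in uses purely local PCA; parameter names and thresholds are illustrative.

        import numpy as np
        from sklearn.neighbors import NearestNeighbors

        def denoise_graph(X, k=10, d=2, angle_tol=np.radians(30)):
            """Prune neighborhood-graph edges that leave the local tangent space.

            X : (n, D) data points; d : intrinsic dimension; k : neighbors.
            Returns the list of surviving directed edges (i, j).
            """
            _, idx = NearestNeighbors(n_neighbors=k + 1).fit(X).kneighbors(X)
            # Estimate a d-dimensional tangent basis at each point (local PCA).
            tangents = []
            for i in range(len(X)):
                local = X[idx[i, 1:]] - X[i]
                _, _, vt = np.linalg.svd(local, full_matrices=False)
                tangents.append(vt[:d])  # orthonormal rows span the tangent space
            # Keep an edge only if it stays within angle_tol of the tangent
            # plane at both of its endpoints.
            edges = []
            for i in range(len(X)):
                for j in idx[i, 1:]:
                    e = X[j] - X[i]
                    e /= np.linalg.norm(e)
                    angles = [
                        np.arccos(np.clip(np.linalg.norm(T @ e), 0.0, 1.0))
                        for T in (tangents[i], tangents[j])
                    ]
                    if max(angles) < angle_tol:
                        edges.append((i, int(j)))
            return edges

    The surviving edges can then be handed to any neighborhood-graph-based embedding (e.g., Isomap or LLE), mirroring the paper's use of the denoised graph as a drop-in input to existing manifold learning algorithms.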

    Evaluating Robustness of Visual Representations for Object Assembly Task Requiring Spatio-Geometrical Reasoning

    This paper evaluates and benchmarks the robustness of visual representations in the context of object assembly tasks. Specifically, it investigates the alignment and insertion of objects with geometrical extrusions and intrusions, commonly referred to as a peg-in-hole task. The accuracy required to detect the peg and hole geometry and orient them in SE(3) space for successful assembly poses significant challenges. To address this, we employ a general visuomotor policy learning framework that uses visual pretraining models as vision encoders. We investigate the robustness of this framework when applied to a dual-arm manipulation setup, specifically under grasp variations. Our quantitative analysis shows that existing pretrained models fail to capture the essential visual features necessary for this task; a visual encoder trained from scratch consistently outperforms the frozen pretrained models. Moreover, we discuss rotation representations and associated loss functions that substantially improve policy learning. We present a novel task scenario designed to evaluate progress in visuomotor policy learning, with a specific focus on improving the robustness of intricate assembly tasks that require both geometrical and spatial reasoning. Videos, additional experiments, the dataset, and code are available at https://bit.ly/geometric-peg-in-hole
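    The abstract notes that the choice of rotation representation and loss function substantially affects policy learning. As one plausible instantiation (an assumption, not necessarily the paper's exact choice), the sketch below shows the continuous 6D rotation representation of Zhou et al. (2019) with a geodesic loss on SO(3), both commonly used for exactly this reason.

        import torch

        def rot6d_to_matrix(x6):
            """Map a 6D rotation representation to a rotation matrix.

            The first two columns of the matrix are predicted directly and
            re-orthonormalized by Gram-Schmidt, avoiding the discontinuities
            of quaternion and Euler-angle parameterizations.
            """
            a1, a2 = x6[..., :3], x6[..., 3:]
            b1 = torch.nn.functional.normalize(a1, dim=-1)
            b2 = torch.nn.functional.normalize(
                a2 - (b1 * a2).sum(-1, keepdim=True) * b1, dim=-1)
            b3 = torch.cross(b1, b2, dim=-1)
            return torch.stack((b1, b2, b3), dim=-1)

        def geodesic_loss(R_pred, R_gt):
            """Geodesic distance on SO(3): the angle of the relative rotation."""
            R_rel = R_pred.transpose(-1, -2) @ R_gt
            trace = R_rel.diagonal(dim1=-2, dim2=-1).sum(-1)
            cos = ((trace - 1.0) / 2.0).clamp(-1.0 + 1e-7, 1.0 - 1e-7)
            return torch.acos(cos).mean()

    A policy head would emit six numbers per target pose, pass them through rot6d_to_matrix, and be trained with geodesic_loss against ground-truth rotations; the continuity of this map is what typically stabilizes gradient-based learning on SE(3) targets.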

    Perception for General-purpose Robot Manipulation

    To autonomously perform tasks, a robot should continually perceive the state of its environment, reason about the task at hand, and plan and execute appropriate actions. In this pipeline, perception remains largely unsolved and is one of the most challenging problems. Common indoor environments typically pose two main problems: 1) inherent occlusions leading to unreliable observations of objects, and 2) the presence of a wide range of objects with varying physical and visual attributes (i.e., rigid, articulated, deformable, granular, transparent, etc.). Thus, we need algorithms that can accommodate perceptual uncertainty in state estimation and generalize to a wide range of objects. Probabilistic inference methods are highly suitable for modeling perceptual uncertainty, and data-driven approaches using deep learning have shown promising advances toward generalization. Perception for manipulation is a more intricate setting that requires the best of both worlds. My research aims to develop robot perception algorithms that generalize over objects and tasks while accommodating perceptual uncertainty, supporting robust task execution in the real world. In this presentation, I will briefly highlight my research in these two threads.

    Physically Plausible Scene Estimation for Manipulation in Clutter

    Perceiving object poses in a cluttered scene is challenging because of the partial observations available to an embodied robot. In addition to occlusions, cluttered scenes exhibit various forms of uncertainty arising from physical object interactions, such as touching, stacking, and partial support. In this paper, we discuss these cases of physics-based uncertainty case by case and propose methods for physically viable scene estimation. Specifically, we use Newtonian physical simulation to check the plausibility of hypotheses within generative probabilistic inference, instantiated as particle filtering, MCMC, and an MCMC variant of particle filtering. Assuming that object geometries are known, we estimate the scene as a collection of object poses and infer a distribution over the state space as well as the maximum likelihood estimate. We compare with ICP-based approaches and present results for scene estimation in isolated cases of physical object interaction as well as multi-object scenes, such that manipulation of graspable objects can be performed with a PR2 robot.
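    A minimal sketch of the plausibility check at the core of this approach, assuming the pybullet simulator stands in for the Newtonian physics engine (the paper does not name pybullet, and the surrounding particle filtering / MCMC machinery is omitted): a hypothesized scene is loaded, simulated until it settles, and scored by how far the objects drift.

        import pybullet as p

        def plausibility_score(hypotheses, urdf_paths, steps=200, tol=0.01):
            """Score a scene hypothesis by its stability under simulation.

            hypotheses : list of (position, quaternion) per object
            urdf_paths : object models (geometries are assumed known, as in
                         the paper); a support surface is assumed to be
                         among them, otherwise everything falls
            Returns a score in (0, 1]; physically stable scenes score near 1.
            """
            client = p.connect(p.DIRECT)  # headless simulation
            p.setGravity(0, 0, -9.81)
            bodies = [p.loadURDF(path, pos, orn)
                      for path, (pos, orn) in zip(urdf_paths, hypotheses)]
            for _ in range(steps):  # let the hypothesized scene settle
                p.stepSimulation()
            drift = 0.0
            for body, (pos, _) in zip(bodies, hypotheses):
                new_pos, _ = p.getBasePositionAndOrientation(body)
                disp = sum((a - b) ** 2 for a, b in zip(new_pos, pos)) ** 0.5
                drift = max(drift, disp)
            p.disconnect(client)
            return 1.0 if drift < tol else tol / drift  # penalize drifting scenes

    Hypotheses that the simulator immediately rearranges (e.g., floating or interpenetrating objects) score poorly, so weighting particles by such a score steers the inference toward physically viable scenes.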